Decision-Level Fusion for Audio-Visual Laughter Detection
نویسندگان
چکیده
Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audiovisual laughter detection is performed by fusing the results of separate audio and video classifiers on the decision level. This results in laughter detection with a significantly higher AUC-ROC than single-modality classification.
منابع مشابه
Fusion for Audio-Visual Laughter Detection
Laughter is a highly variable signal, and can express a spectrum of emotions. This makes the automatic detection of laughter a challenging but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is performed by combining (fusing) the results of a separate audio and video classifier on the decision level. ...
متن کاملFeature-Level Decision Fusion for Audio-Visual Word Prominence Detection
Common fusion techniques in audio-visual speech processing operate on the modality level. I.e. they either combine the features extracted from the two modalities directly or derive a decision for each modality separately and then combine the modalities on the decision level. We investigate the audio-visual processing of linguistic prosody, more precisely the extraction of word prominence. In th...
متن کاملComparison of single-model and multiple-model prediction-based audiovisual fusion
Prediction-based fusion is a recently proposed audiovisual fusion approach which outperforms feature-level fusion on laughter-vs-speech discrimination. One set of predictive models is trained per class which learns the audio-to-visual and visual-to-audio feature mapping together with the time evolution of audio and visual features. Classification of a new input is performed via prediction. All ...
متن کاملCombining acoustic and visual features to detect laughter in adults' speech
Laughter can not only convey the affective state of the speaker but also be perceived differently based on the context in which it is used. In this paper, we focus on detecting laughter in adults’ speech using the MAHNOB laughter database. The paper explores the use of novel long-term acoustic features to capture the periodic nature of laughter and the use of computer vision-based smile feature...
متن کاملAudio-visual Laughter Synthesis System
In this paper we propose an overview of a project aiming at building an audio-visual laughter synthesis system. The same approach is followed for acoustic and visual synthesis. First a database has been built to have synchronous audio and 3D visual landmarks tracking data. Then this data has been used to build HMM models of acoustic laughter and visual laughter separately. Visual laughter model...
متن کامل